
    Improving Speech Recognition for Interviews with both Clean and Telephone Speech

    High-quality automatic speech recognition (ASR) depends on the context of the speech. Cleanly recorded speech yields better results than speech recorded over telephone lines. In telephone speech, the signal is band-pass filtered, which limits the frequencies available for computation. Consequently, the transmitted speech signal may be distorted by noise, causing higher word error rates (WER). The main goal of this research project is to examine approaches to improve recognition of telephone speech while maintaining or improving results for clean speech in mixed telephone-clean speech recordings, by reducing mismatches between the test data and the available models. The test data includes recorded interviews where the interviewer was near the hand-held, single-channel recorder and the interviewee was on a speakerphone with the speaker near the recorder. Available resources include the Eesen offline transcriber and two acoustic models, one based on clean training data and one based on telephone training data (Switchboard). The Eesen offline transcriber runs on a virtual machine available through the Speech Recognition Virtual Kitchen and uses an approach based on a deep recurrent neural network acoustic model and a weighted finite state transducer decoder to transcribe audio into text. This project addresses the problem of high WER that arises when telephone speech is tested on cleanly trained models by 1) replacing the clean model with a telephone model and 2) analyzing and addressing errors through data cleaning, correcting audio segmentation, and adding words to the dictionary. These approaches reduced the overall WER. This paper includes an overview of the transcriber, acoustic models, and the methods used to improve speech recognition, as well as results of transcription performance. We expect these approaches to reduce the WER on the telephone speech. Future work includes applying a variety of filters to the speech signal, which could reduce both additive and convolutional noise resulting from the telephone channel.
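
    For readers unfamiliar with the metric, WER is conventionally defined as the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the example sentences are assumptions made here and are not taken from the paper or from the Eesen transcriber.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative strings only: one substitution and one deletion
    # against a six-word reference.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

    Because insertions are counted, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.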

    Complete Subdivision Algorithms, II: Isotopic Meshing of Singular Algebraic Curves

    Given a real-valued function f(X,Y), a box region B_0 in R^2 and a positive epsilon, we want to compute an epsilon-isotopic polygonal approximation to the restriction of the curve S=f^{-1}(0)={p in R^2: f(p)=0} to B_0. We focus on subdivision algorithms because of their adaptive complexity and ease of implementation. Plantinga and Vegter gave a numerical subdivision algorithm that is exact when the curve S is bounded and non-singular. They used a computational model that relied only on function evaluation and interval arithmetic. We generalize their algorithm to any bounded (but possibly non-simply connected) region that does not contain singularities of S. With this generalization as a subroutine, we provide a method to detect isolated algebraic singularities and their branching degree. This appears to be the first complete purely numerical method to compute isotopic approximations of algebraic curves with isolated singularities.
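
    To give a feel for the computational model mentioned above (function evaluation plus interval arithmetic), the sketch below shows only the simplest ingredient of a subdivision scheme: an exclusion test that discards a box when the interval evaluation of f over it cannot contain zero. It is not the Plantinga-Vegter algorithm or the paper's method, which also need a gradient-based predicate and a mesh construction step. The example curve (the unit circle), the helper names, and the `min_width` stopping rule are all assumptions made here, and floating-point rounding is ignored, so the sketch is illustrative rather than rigorous.

```python
from typing import List, Tuple

Interval = Tuple[float, float]
Box = Tuple[Interval, Interval]


def iadd(a: Interval, b: Interval) -> Interval:
    return (a[0] + b[0], a[1] + b[1])


def isub(a: Interval, b: Interval) -> Interval:
    return (a[0] - b[1], a[1] - b[0])


def isqr(a: Interval) -> Interval:
    """Range of t*t for t in the interval a."""
    lo, hi = a
    if lo >= 0:
        return (lo * lo, hi * hi)
    if hi <= 0:
        return (hi * hi, lo * lo)
    return (0.0, max(lo * lo, hi * hi))


def f_box(box: Box) -> Interval:
    """Interval enclosure of the assumed example f(x, y) = x^2 + y^2 - 1."""
    x, y = box
    return isub(iadd(isqr(x), isqr(y)), (1.0, 1.0))


def subdivide(box: Box, min_width: float) -> List[Box]:
    """Return small boxes that the exclusion test could not discard."""
    lo, hi = f_box(box)
    if lo > 0 or hi < 0:
        return []  # zero is not in the enclosure, so the curve misses this box
    (x0, x1), (y0, y1) = box
    if max(x1 - x0, y1 - y0) <= min_width:
        return [box]
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    kept: List[Box] = []
    for xs in ((x0, xm), (xm, x1)):
        for ys in ((y0, ym), (ym, y1)):
            kept.extend(subdivide((xs, ys), min_width))
    return kept


if __name__ == "__main__":
    boxes = subdivide(((-2.0, 2.0), (-2.0, 2.0)), 0.25)
    print(len(boxes), "candidate boxes remain near the curve")
```

    The surviving boxes form a cover of the curve inside the initial box. Turning such a cover into an isotopic polygonal approximation is precisely the part that requires the additional predicates described in the abstract.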

    Sleepless in Seoul: `The Ant and the Metrohopper'

    One of Aesop's (La Fontaine's) famous fables, `The Ant and the Grasshopper', is widely known to give a moral lesson through a comparison between the hard-working ant and the party-loving grasshopper. Here we show a slightly different version of this fable, namely, "The Ant and the Metrohopper," which describes human mobility patterns in modern urban life. Numerous real transportation networks and trajectory data have been studied in order to understand mobility patterns. We study trajectories of commuters on the public transportation of Metropolitan Seoul, Korea. Smart cards (Integrated Circuit Cards; ICCs) are used in the public transportation system, which allows collection of transit transaction data, including departure and arrival stations and times. This empirical analysis provides human mobility patterns, which impact traffic forecasting and transportation optimization, as well as urban planning. Comment: to appear in Journal of the Korean Physical Society

    Statistical Analysis of the Metropolitan Seoul Subway System: Network Structure and Passenger Flows

    The Metropolitan Seoul Subway system, consisting of 380 stations, provides the major transportation mode in the metropolitan Seoul area. Focusing on the network structure, we analyze statistical properties and topological consequences of the subway system. We further study the passenger flows on the system, and find that the flow weight distribution exhibits a power-law behavior. In addition, the degree distribution of the spanning tree of the flows also follows a power law. Comment: 10 pages, 4 figures
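
    As a rough illustration of what checking for power-law behavior can involve, the sketch below estimates the exponent of a continuous power law p(x) ~ x^(-alpha) for x >= x_min using the standard maximum-likelihood formula alpha = 1 + n / sum(ln(x_i / x_min)). The abstract does not state how the authors fitted their distributions, so this estimator is an assumption made here, as are the function names. The data are synthetic samples drawn only to exercise the code, not subway flow data.

```python
import math
import random
from typing import List, Sequence


def estimate_power_law_exponent(samples: Sequence[float], x_min: float) -> float:
    """Maximum-likelihood exponent for a continuous power law above x_min."""
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum <= 0:
        raise ValueError("need samples strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(alpha: float, x_min: float, n: int, seed: int = 0) -> List[float]:
    """Draw synthetic power-law samples by inverse-transform sampling."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the power below is always defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    synthetic = sample_power_law(alpha=2.5, x_min=1.0, n=20000)
    print(estimate_power_law_exponent(synthetic, x_min=1.0))
```

    With real flow weights, the choice of `x_min` matters a great deal, and a straight line on a log-log plot is not sufficient evidence of a power law on its own.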